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A POSTERIORI FINITE ELEMENT BOUNDS FOR SENSITIVITY DERIVATIVES OF 
PARTIAL-DIFFERENTIAL EQUATION OUTPUTS + 

ROBERT MICHAEL LEWIS t, ANTHONY T. PATERA * * * § , AND JAUME PERAIRE § 


Abstract. We present a Neumann-subproblem a posteriori finite element procedure for the efficient 
and accurate calculation of rigorous, “constant-free” upper and lower bounds for sensitivity derivatives of 
functionals of the solutions of partial differential equations. The design motivation for sensitivity derivative 
error control is discussed; the a posteriori finite element procedure is described; the asymptotic bounding 
properties and computational complexity of the method are summarized; and illustrative numerical results 
are presented. 
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Subject classification. Applied and Numerical Mathematics 

1. Introduction. We consider here an engineering system or component characterized by a design 
variable (or vector) (3. We assume that the behavior of this system can be adequately represented by the 
solution of an appropriate partial differential equation, or perhaps set of partial differential equations. A 
typical engineering “forward” analysis is thus initiated by the specification of the design variable /3; the partial 
differential equation then yields a field variable (or set of field variables) u(-;/3); finally, on the basis of these 
intermediate field variables, the output (s) s(/3) can be evaluated. Here “output” denotes the engineering 
quantity of interest, that is, the metric relevant to the performance of the system. For our purposes here, 
we assume that this output is a linear functional of the field variable, s{(3) ~ 

Of ultimate interest, of course, is not the forward problem, but the “design problem.” In brief, the design 
problem articulates the engineering objectives and constraints as a function of the outputs, and then seeks 
the best value of the design variable j3 with respect to the selected criteria. Successful solution of the design 
problem requires repeated appeal to the forward problem in order to calculate (i) the output s(/?), and (ii) 
the sensitivity derivative of the output with respect to the design variable, s f = dsfdfi. The sensitivity 
derivatives can be important both in informal and formal design optimization contexts: in the former, s' 
provides promising new search directions as well as an indication of design “robustness”; in the latter, s' 
provides the gradients required by (rapidly convergent) quasi- Newton methods [11, 12]. 

For most problems of engineering interest, the underlying partial differential equations are far too com- 
plex to admit analytical solution, and a numerical approximation must thus be introduced; we shall consider 
here finite element methods. The requirements on the numerical approximation are twofold: the approxima- 
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tion must be sufficiently coarse so as to permit repeated appeal within the design context; the approximation 
must be sufficiently fine so that the numerical prediction of the desired outputs and associated sensitivity 
derivatives is representative of the true performance of the system. 

A posteriori error control offers great promise in reconciling these often conflicting requirements. A 
posteriori analysis [3, 29] is composed of two critical ingredients: an estimation procedure which inexpensively 
assesses the error in a particular numerical solution; and an adaptive refinement procedure which exploits 
this error information to optimally improve the numerical solution. There are two objectives of a posteriori 
error control: to eliminate numerical uncertainty — arguably the single largest impediment to widespread 
adoption of simulation based design; and to improve the numerical efficiency of the forward and optimization 
problem thus permitting much more extensive design exploration. 

In fact, greater certainty is a prerequisite for greater efficiency: we may consider a less expensive (or 
even the least expensive) discretization only if the associated error can be quantified, and hence constrained 
and controlled; we are no longer compelled to choose cither certainty or efficiency — both can be achieved. 
Simultaneous control of approximation accuracy and approximation cost is particularly critical in the devel- 
opment of robust and effective design optimization procedures: for example, the accuracy of the sensitivity 
derivatives s r strongly affects both the convergence, and the convergence rate, of (say) trust region [9, 19] 
and line search [l 1, 20] quasi Newton techniques [28]. 

In all earlier a posteriori error analysis techniques, either — in implicit approaches — the measure 
of the error is not related to the actual engineering outputs of interest (c.g,, [15, 4, 2]), or in explicit 
approaches - the error estimates for the engineering outputs of interest involve numerous undetermined or 
uncertain constants or functions (e.g.,[5, 6, 27]); in both cases, quantitative confirmation and hence both 
certainty and efficiency — is seriously compromised, and the relevance to engineering design greatly reduced. 
In [21, 23, 25, 18] we propose a new class of a posteriori procedures that provide a critical new “enabling 
technology”: the ability to obtain inexpensive, sharp, rigorous, and quantitative (“const ant -free”) bounds 
for the numerical error in the engineering outputs of interest. Our method is thus directly relevant to the 
design process, and should lay the foundation for systemic application of a posteriori error control within 
the engineering context. 

Although our method provides a critical new capability, we arc nevertheless indebted to earlier a posteri- 
ori implicit (Neumann subproblem) techniques [15, 4, 2] for several important conceptual and mathematical 
ingredients in particular duality theory and flux “hybridization.” The former, though not strictly neces- 
sary — and even sometimes restrictive — provides a derivational mechanism without which the requisite 
equations are very difficult to motivate; the latter technically quite subtle — is the crucial component in 
ensuring computational efficiency. Our framework may thus be viewed as a generalization of earlier implicit 
techniques [2]; equivalently, these earlier techniques can be interpreted as special cases of our new formula- 
tion. We present in [25] a more detailed comparison between our approach and earlier implicit (and explicit) 
a posteriori error control proposals. 

Our initial formulation [23, 25, 24] focused on symmetric coercive problems (c.g., Poisson and Linear 
Elasticity), and nonsymmetric coercive problems (e.g., Convection Diffusion), as well as certain constrained 
problems (the Stokes equations, central to hydrodynamics [22]). However, we have recently developed a 
more general formulation [26, 17, 16] that greatly expands the class of equations and outputs that may be 
treated; in particular, noncoercive problems (e.g., the Helmholtz equation), nonlinear problems (e.g., the 
Burgers equation, the eigenvalue problem), and nonlinear outputs can now be addressed. The approach is 
relevant both to Galerkin techniques [17] and stabilized Galerkin methods [16]. 
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In this paper we extend our framework to the case in which we are interested not only in error bounds 
and adaptive procedures for the output s, but also in error bounds and adaptive procedures for the sensitivity 
derivative 5 '. As already indicated, effective control of the accuracy of sensitivity derivatives is crucial in 
ensuring both the quality and efficiency of engineering optimization procedures [9, 10]. It is thus surprising 
that there is relatively little work on a posteriori error estimation relevant to sensitivity derivatives [7], in 
particular since there is considerable debate over the best way — discrete or continuous [28, 13, 8, 14] — 
to calculate the sensitivity functions and adjoints required to compute s'. In this work we aim to at least 
partially address the paucity of a posteriori results for this important class of problems. 

In Section 2 we describe our model problem. For our purposes here we consider a simple one dimensional 
coercive differential equation: noncoercivc, semi linear, and multi- dimensional problems [17] require virtually 
no modifications to the framework, and will be addressed in future publications. Furthermore, in this first 
paper we focus exclusively on a posteriori error estimators for s'; once these estimators are obtained, the 
adaptive procedures described in [25, 16] can be directly applied. We next introduce in Section 3 the relevant 
finite clement spaces and approximations. In Section 4 we describe the a posteriori procedure; Section 5 
then discusses the computational complexity of this procedure and the asymptotic bounding properties of 
the resulting lower and upper estimators. Finally, in Section 6, we present a series of illustrative numerical 
results. 

2. Problem Statement. As our model problem we shall consider a second -order inhomogeneous 
Dirichlet problem for u(x;{3)> 


-v{0)u xx + p{0)u x + a{0)u = f in 


u(0) = 0, tt(l) = 1, 

where / is the prescribed inhomogeneity and ft =]0, 1[ is the domain. Here 1 /, p , and a are respectively strictly 
positive, general, and non negative real quantities which are independent of x but may depend on the design 
parameter (3 . (Extension of the framework to permit these coefficients to vary with x is straightforward, as 
will be clear from the exposition; examples will be presented in future publications.) Although we consider 
here the Dirichlet problem, Neumann problems can also be readily treated. 

Our point of departure shall be the weak formulation: Find u E X D such that 

(2.1) o(u > v) = (/,v), VveX, 

where X = Hq(H) and X D = i/^(ft). We recall that Hq( fi) is the space of functions v such that (i) v 
and v x are in L 2 (il) (that is, square integrable [l]), and (ii) v(Q) — v(l) = 0; similarly, i/^(0) is the set of 
functions v such that (i) v and v x are in L 2 (D), and (ii) t>(0) = 0, v(l) = 1. The bilinear form a(u;,i;) (or 
more precisely a(u?,i>;/3)) is given by 

( 2 . 2 ) a(w , v)= v(/3)w x v x + p(/3)w x v + a{(3)wv dx. 

J 0 

Finally, / E ff _1 (fl) (the dual of if^fl)), and (, ) thus denotes the duality pairing. 

Our interest is not directly in u{x\ /?), but rather in the output s(/3) — £{u{x\j3)) and the sensitivity 
derivative s'(/3). The latter may be computed as 

(2-3) s'(/3)=e(U(x;/3)), 
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where U = u'(x;P) = e X, the sensitivity function, satisfies 


(2.4) 


a(U,V) = (f\V)-a'(u,V), Wei, 


in which 
(2.5) 



v'{f3)w x v x + p'((3)w x v + a(/3)wv dx\ 


here ' denotes, as always, We could also readily permit the Dirichlet data to depend on /?; in this case 
U in (2.4) would satisfy inhomogeneous Dirichlet conditions. Finally, we note that the sensitivity function 
is only one of two approaches to the computation of sensitivity derivatives; the adjoint formulation will be 
considered in future publications. 

For the formulation presented here the computation of the sensitivity derivatives thus requires, first, the 
solution of (2.1) for u, then the solution of (2.4) for £/, and finally the evaluation of $'(/?) from (2.3): (2.1) 
and (2.4) are coupled but “triangular.” However, if the only dependence on /? is through / (or, in fact, the 
Dirichlet data) — that is, v f = pf = a' = 0, and hence o! — 0 — then the system is no longer coupled: we can 
compute U and hence s' independently of u and s . In this instance — which arises in many design contexts, 
including certain open loop control problems and shape optimization formulations we may directly apply 
our earlier bound procedures [23, 25, 17] to (2.4) and (2.3); as we shall sec, this “direct” approach is, in fact, 
a special case of the more general framework developed in this paper. 

Finally, we close this section by slightly generalizing our problem statement (2.1), (2.4), (2.3). In 
particular, we introduce a real positive parameter cr, and replace (2.1), (2.4), and (2.3) with 


( 2 . 6 ) 


a(u,v ) = (/, v), Vt> e X , 


(2.7) a(U, V) = V) - a'(u, V)), W e X, 
and 

(2.8) s'(/3) = -£(U), 

<y 

respectively. It is clear from the linearity of our output functional that (2.1), (2.4), (2.3) and (2.6), (2.7), (2.8) 
do indeed yield the same sensitivity derivative; this simply reflects the invariance of the sensitivity derivative 
to rescalings of the design parameter. However, (2.6), (2.7), (2.8) permits us to control the magnitude of a' 
relative to the magnitude of a in the coupled system (2.6), (2.7); this will be advantageous in the context of 
our error bound procedure. 

It will prove very convenient in what follows to write (2.6), (2.7) more succinctly in terms of a product 
space. In particular, we define Y D = X D <g> X , and Y = X x X, and introduce the bilinear form 

(2.9) £((«/, W) 9 (v, V)) = a{w , v) + a(W, V ) + aa'(w, V), 

for a and a ' as defined in (2.2) and (2.5), respectively. It is then simple to see that (u, U) e Y D satisfies 

(2.10) £((ti, U), (v, V )) - (/, v) + <r(f\ V ), V(v, V) G Y, 

since variations in v and V recreate (2.6) and (2.7), respectively. (Note that £ in (2.9) can be further 
generalized: the v and V contributions need not be assigned the same weight.) 
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3. Finite Element Approximation. We shall consider here only uniform meshes, although in practice 
(adaptive) non uniform meshes are preferred. We first introduce a uniform triangulation of fi, T$, comprising 
Ks elements Ts of uniform length 5 = K s 1 \ the corresponding Ns = Ks + 1 nodes of Th are denoted by 

i = (i — 1)£, i = 1, . . . , N&. We then define our linear finite element approximations spaces as 

(3.1) X 6 = {v\ Ts elPliTs), VT s eT s ] nl, 

(3.2) Xf = {v|t, e P i (T s ), VT S e Ts} n X D , 

where JPi (Ts) refers to the space of linear polynomials over Ts . Finally, we define our approximation product 
spaces as Y 6 D = Xf ® Xs and Y$ = Xs <S> Xs. 

Our finite element approximation (us, Us) G V s*,s' & G JR to (u,£/) G Y D ,s l G JR of (2.6), (2.7), (2.8) is 
then given by 

(3-3) £((us>Us), (v, V)) = (/, v) + a(f\ V) , V(v, V) G F*, 

and 

(3-4) s's = -£(£/*). 

<T 

Summarizing the a priori theory, it is readily demonstrated that the problems (2.10) for (u, U) and (3.3) for 
(us,Us) are well-posed, yielding unique solutions. It can be further shown that, given sufficient regularity, 
[|u — u$||i and \\U — U$\\i vanish as O(rf) , where || • ||i denotes the H 1 norm; in addition, application of 
Aubin-Nitsche theory to our product space formulation confirms that, as expected, ||u — u^fjo and ||J7 — Us\\ o 
vanish as 0(5 2 ), where || ■ ||o denotes the L 2 norm. Finally, Aubin-Nitsche theory also readily reveals that 
both | s - ss | and |s' - s f s \ vanish as 0(H 2 ) (here ss = £(us))- 

For the purposes of our bound formulation we shall be interested in two particular finite element spaces: 
a coarse, or design, approximation space, Xh = Xs^h] and a fine, or “truth,” approximation, X \ = Xs=h - 
We require that H/h is integral such that 7^ is a refinement of Th. The corresponding coarse and fine 
finite element approximations — solutions of ( 3 . 3 ), ( 3 . 4 ) — will be denoted (u H ,U H ) E Yfi, G M and 
(uh,Uh) G Y^,8 f h G iR, respectively. In brief, the coarse approximation is the approximation that we must 
use due to computational constraints; the fine approximation is the approximation that we would like to 
use — that is, the approximation on which we can (and will) safely assume that Uh,Sh, and s f h are 
all arbitrarily close to the corresponding exact quantities u,U,s y and s'. For future reference we define 
e = Uh — uh and E =Uh — TJh ; it follows from our a priori results that, for h sufficiently small compared to 
H y ||e||i and ||F||i vanish as 0(H), while ||e||o, ||i?||oj and \s' h — s r H \ vanish as 0(H 2 ). 

Our goal will be to obtain bounds for s f h — the sensitivity derivative on the fine mesh — at considerably 
less cost than direct computation of s' h = £(Uh)] indeed, the cost to compute the bounds will typically be only 
slightly greater than the cost to compute the coarse sensitivity derivative, s ' H . The coarse approximation will 
provide the “guesses” — ■ in fact, Lagrange multipliers which will ensure that these bounds are sufficiently 
accurate; in particular, given our a priori results, we require that the bounds approach s' h as 0(H 2 ). We 
will, indeed, achieve this desired (optimal) rate of convergence. 

4. Bound Formulation. 
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4.1. Preliminaries: Spaces and Forms. We first introduce coarse and fine broken spaces, 


(4.1) Xh = {v\t h £ JPi(Th), VT>/ e T H ] 

(4.2) X h = € Pi(T h ), WT h € % | v\ T „ € C 0 ^), VT„ e T„}, 

where C°(T//) is the space of continuous functions over X//; note that the fine broken space is continuous 
within each Th . We can thus form the associated broken product spaces Yh = Xh®Xh and Yh — 

We next introduce the hybrid flux space Q = M Nh , the members Qi of which are defined at the xni,i = 
1 , . . . , Nfj, the nodes of the coarse approximation; the corresponding product space is given by Z — Q (8) Q. 
We then define the bilinear form 6: YhipxYn) x Z — + 1R as 

n h n I{ 

(4.3) b((v, V), (q, Q)) = Y, Mn9n + £ M.Q». 

n=l n = 1 

where [it;] denotes the jump at the interfaces: for any 1 < i < Nh , Hn = ^(4J — w ( x Hn )> w ^ ere 
w ( x tin) (respectively, w(x]j n )) refers to the limit as x — ► XHn from the right (respectively, left); at the 
two endpoints, we define [ic]i = w(0) and [lujjvj, = ■— u;(l). The bilinear form imposes continuity (and 
homogeneous Dirichlet boundary conditions) on members of the broken spaces in the sense that 

(4.4) Yh = {(v 7 V)eY H \ b{(v, V ), (g, Q)) = 0, V(g, Q) e Z}, 

(4.5) Yh = {(v 9 V)eY h | b((v, V), (g, Q)) = 0, V(g, Q) £ Z}. 

Extension of these spaces and bilinear forms to the multidimensional case is discussed in detail in [23, 25, 18]. 

We shall also require several forms related to £ of (2.9). First, we define in standard fashion the 
symmetric part of 

(4.6) S' *{{w, W), (v, V )) = a>, v) + a 3 (W, V ) + |(a'(u», V) + a'(v, W)), 

where a s is the symmetric part of a — for our boundary conditions simply a with the convective term 
(p(/3)w x v) omitted. Wc then further decompose £ s = £y -f £* M \ we shall consider two such decompositions, 
Alternative A and Alternative B. In Alternative A we choose 

(4.7) £{,((w,W),(v,V)) = a s (w,v)+a s (W,V)+^ v'{w x V x + v x W x )dx, 

(4.8) S s M ((v,W)>(v,V))=^([ p'iwxV + v x W)dx + f a'(wV + v\V)dx). 

2 Jo Jo 

In Alternative B we choose for £y and £ S M 

^(( W ,iy),(n,y)) = a s (n;,t;)+a s (iy ( I/) + J f u' (w x V x + v x W x )dx 

2 Jo 

(4.9) + I' p'{w x V + v x W)dx + J \p'\ £ WV dx, 

2 Jo 2 J 0 

and 

(4.10) £* m ((w, W ), (v, V)) = | a’(wV + vW)dx - ^\p'\ jf* WV dx, 
respectively. 
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4 . 2 . Estimator Procedure. The estimator procedure comprises five steps [23, 18, 17], which we now 
describe. Note that we unfold some of the product forms into somewhat more transparent notation in the 
next section when we discuss computational complexity in greater detail. 

1. Compute ( uh,Uh ) € Y H from 

(4-11) S ((u H , U H ), (v, V)) = (f, v) + a </', V), V(v, V) G Y„ , 

and define the associated residual 

(4.12) n u H ((v, V )) = {/, v) + <r(f, V) - €((u h , U H ), (v, V )), 
for any (v, V ) in Yh- 

2. Compute the adjoint (ipH, ^ h) € Y/j from 

(4.13) €((v, V), (4 >h, )) = V(«, V) € Yh, 

(T 

and define the associated residual 

(4.14) n%((v, V)) = - £((V, V ), Wh, *„)), 

<J 

for any (v, V') in Yh- 

3. Find the hybrid fluxes (p-^, P 1 ^) G Z and (p^, P*) G Z such that 

(4-15) b((v > V),(p'* H ,pV))=n u H ((v,V)), V(t >,V)eY H , 

(4-16) b((v, V), (p$ t Pg)) = V)), V(«, V) € 

These equations have a unique solution thanks to equilibrium (4.11), (4.12); in higher space dimensions the 
matter is considerably more subtle [23, 18], though now well understood thanks to the important contribu- 
tions in [15, 4, 2]. 

4. Find the reconstructed errors (e u ,E u ) G G Yh such that 

(4.17) ^((^ t ^),(t» l V0) = ^((ti f v))-6((t,,n(p&,^)), v(®.v)er fc> 

(4.18) 2#((e* £*), (t», V)) = n%((v, V )) - fc(( w , V), (pt, Pg)), V(«, F) G y fc . 

The well posedness of this problem for our different choices of £y will be discussed shortly. 

5. Compute the lower (s'_) and upper (s' + ) estimators as $± ~s f ± A', where 

(4.19) s' = s' H - 2£y((e u , E u ), (e* £*)), 

(4.20) A' = 2 v /^((e u ,£ t/ ),(e“ ) ^))ff((e^,£:' I '),(e^,^)). 
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The difference between the upper and lower estimators — the bound gap, A ' — can be decomposed into 
elemental contributions; the latter then serve as local indicators for adaptive refinement [25, 16]. 

Note that, for any given cr, these estimators have already been optimized over the scaling parameter k of 
[25, 18, 17]. 

We now address the well posedness of Step 4; to begin, we consider Alternative A, with a = 0. We 
first note from (4.7) that (4.17), (4.18) is singular in this case: £f((, ),(v,V)) vanishes for {v,V)\ t u — 
(<*£*), VTk e Th , where the cJ H ,c% H are real constants. However, thanks to the definition of the 
hybrid fluxes, (4.15), (4.16), the problem is solvable; furthermore, it is clear from (4.19), (4.20) that the 
final bounds in Step 5 do not depend on the nullspace component chosen — though for theoretical purposes 
we should select the reconstructed errors to be of zero mean over each Th [18]. We next note that, for 
0 < a < 2i//|i/|, Sy is positive semi definite over Y/*, since 

£y ((v, V), (v,V)) — a s (v,v) -I- a s (F, V) + ov* [ v x V x dx 

Jo 

(4.21) > v J (v 2 x + V?)dx - ||i/| jf {vl + V?)dx; 
we have used here the standard inequality 

1 b 2 

(4.22) \ab\ < —(era 2 — ) 

for a and b real numbers and e = 1. It further follows from (4.21) that, first, (4.17), (4.18) is well posed 
and stable (coercive) over the nullspace excised quotient space, and second, the argument of the radical in 
Step 5 is always non negative. In short, the entire bound procedure is well-posed. The case in which a/0 
is, in fact, simpler: the problems are no longer singular, however the advantageous zero mean property still 
obtains [18]. 

Alternative B leads to a rather similar analysis. We presume that p r ^ 0, as otherwise Alternative A 
and Alternative B are identical; and we first consider the more difficult case, a = 0. We find from (4.9) 
that (4.17), (4.18) is again singular, however the nullspace is now different: £y((, ),(^,1 / )) vanishes for 
{v,V)\ Tn = (cp,0), VT// 6 T h , where the cj ir are real constants. However, due to the definition of the 
hybrid fluxes, (4.15), (4.16), the problem is solvable; furthermore, it is clear from (4.19), (4.20) that the 
final bounds in Step 5 do not depend on the nullspace component chosen — for theoretical reasons we select 
e u , to be of zero mean. (It can also be argued that the elemental mean of E v , E* , though not zero, will 
be suitably small.) We next note that, for 0 < a < 2v/{\v'\ + \(/\), £y is positive semi-definite over Y h , since 

£P((V, n (v, V )) > «/ J\vl + v*)dx - °-\v'\ J\vl + V?)dx 

(4.23) ( V l + V ^ dx+ ^\P'\ J o V2(ix ’ 

where we have again evoked (4.22) with e = 1. It follows, as for Alternative A, that first, (4.19), (4.20) is 
well -posed and stable (coercive) over the nullspace excised quotient space, and second, the argument of the 
radical in Step 5 is always non negative. The case in which a ^ 0 poses no problem: the problems are no 
longer singular, however the zero mean property for e u ,e^ still obtains. 

5. Estimator Properties. In order to be of interest, the estimators must enjoy certain properties: 
(i) the estimators s'_,s' + must correspond to bounds under appropriate hypotheses, and converge to the 
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true result s' h sufficiently quickly; and (ii) the estimators must be inexpensive to compute relative to £(Uh). 
We consider the former in Section 5.1, and the latter in Section 5.2. Note that, as regards computational 
complexity, we effectively anticipate multi dimensional problems. 

5.1. Bounding Properties. We consider only the lower estimator; similar results obtain for the upper 
estimator. To begin, we note that it follows directly from the general formulation of [17, 16] that, for both 
Alternative A and Alternative B, 

(5.1) s'_ = 4 " «*££((« -i,E-E),(e-e,E-E))- n*E s M ((e, E), (e, E)), 
where 

(5.2) (e, E) = (e u + ±e* , E u + \e% 

hi* K* 

and 

(5.3) k* = V /^((e^,^),(e^ 1 ^))/£;*,((e“ ) ^),(e“ ) ^)) 

is the optimal scaling parameter of [25, 17]. We assume that a is chosen such that E £ is positive- definite. 

Considering first the second — negative-definite — term of (5.1), it is clear from our a priori results 
for the H l error that this term should vanish no faster than 0(H 2 ). In fact, based on arguments similar 
to those developed in [17], it can be shown that, given sufficient regularity, this negative definite term will 
indeed vanish quadratically. We now turn to the third indefinite - term. For Alternative A we obtain 
from (4.8) and the inequality (4.22) that 

(5.4) \E° M ((e,E),(e,E))\ < f(|p'| [\ et £ + — )dx + |a'| f\e 2 + E 2 )dx). 

1 Jo e Jo 

From our a priori results for the H l and L 2 error it follows that, for the choice e = H, \E s M ((e,E), (e, J57))| 
will vanish at least as fast as 0(H 3 ). For Alternative B, (4.10), we obtain in a similar fashion that 

(5-5) \E° M ((e,E),(e,E))\ < ^(\a'\ j\e 2 + E 2 )dx + \p'\ £ E 2 dx), 

which, from our a priori results for the L 2 error, should vanish as 0(H 4 ). 

We can thus draw two conclusions. First, for H sufficiently small, the indefinite term in (5.1) will be 
subdominant (in fact, vanishingly small) relative to the principal (negative definite) contribution, and thus 
s r _ will approach s f h from below we obtain an asymptotic lower bound . Similar arguments demonstrate 
that s+ approaches s ; h from above, and thus is an asymptotic upper bound. For the case in which only 
i/' and /' are non zero, we obtain rigorous lower and upper bounds for all H\ in practice, we find that, 
even for p f and oJ non zero, it is very difficult not to obtain bounds — consistent with earlier applications 
of our bound procedure to noncoercive and nonlinear problems [26, 17, 16]. Second, we conclude that the 
estimators should converge to s f h at the optimal rate — 0(H 2 ) for our linear finite element approximation. 

5.2. Computational Complexity. There are two issues that must be addressed: first, how does the 
computational cost of s'_ (say) compare to that of s ( h = and second, how docs the complexity scale 

when we permit several outputs, s m = £ m (u),m = 1, . . . , M, and several design parameters (or “inputs”), 
PjJ = 1, . . . , J? The former is discussed in great detail in [23, 25, 18, 16]. The essential point is that the 
work on the fine mesh — Step 4 — is reduced to computations over the broken space Y^: it follows from 
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the superlinearity of most solution strategies (at least in higher space dimensions) [16] that computation of 
(e u , E u ) and (e^, E *) is much less expensive than computation of Uh> Uh indeed, often no more expensive 
than computation of Uh,U h . There are several other factors that further significantly decrease the cost of 
(e tt , E u ) and (e^, E*) relative to u h , U h \ the equations for (e u , E u ) and (e 1 ^, £?*) will be symmetric, linear, 
and positive definite even when the original operator enjoys none of these properties; and the equations 
for (e u ,E u ) and (c*,iS*) admit obvious — communication-free, completely concurrent — medium grained 
parallelism. 

Turning now to the multiple-output, multi pie- input question, our interest is in computing bounds for 
the MJ quantities (s h )™, the derivatives of the = £ m {u h ) with respect to the fy. We now address each 
step of the bound procedure. We shall assume that the usual nodal bases are evoked to transform the weak 
statements into appropriate linear algebraic systems, Ax = y; A and y shall be denoted the “matrix” and 
“right-hand side,” respectively. 

1. We first compute uh € Xfj, 

(5.6) a(uH , v) = {/, v), Vv e X H , 
and then (Uh) j € Xh , j = 1> • • > 

(5.7) a((U H )j,V) = cr((fj,V) - aj(u H ,V)), VF eX H} 

where cij(w,v) (respectively f 3 ) refers to the derivative of the bilinear form a (respectively /) with respect 
to flj. We must thus solve one system in (5.6), and J systems in (5.7). The J + 1 systems in (5.6), (5.7) 
share a common matrix; only the right hand side varies. 

2. We first compute ^ G X//, m = 1, . . . , M, 

(5.8) o(V,®3) = -b? m (n WeX H , 
and then (V 1 //)™ £ Xn,j = 1, . . . ,J,m = 1, . . . ,M, 

(5.9) a(t>, (V’/f)" 1 ) = -aj(n, E), Vv e 

We must thus solve M systems in (5.8), and MJ systems in (5.9). All M + MJ systems share a common 
matrix in fact the transpose of the matrix associated with (5.6), (5.7); only the right-hand side varies. 

3. We now compute the hybrid fluxes {p u H , P^)j e Z,j = 1, . . . , J, and (p^, PfDJ 1 € Z, j = 1, . . . , J,m = 
1, . . . ,M, from 

(5.10) b((v, V),{p u H ,P%) 3 ) = (n u H )j((v,V)), V(v,V) e Y h , 

(5.11) b((v,V), V(u, V) € Y h , 

where 

(fc&Wfa v )) = (/, v) + cr{fj,V) - SjttuH, (C/h),), (v, V)), 

= -ic(v)-f J -((t,,v) ) ((^)?,#S)) I 
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and £j is £ of (2.9) with a' replaced by a 3 . The systems (5.10) and (5.11) correspond to small, local problems 
— one problem (in fact four problems given the four hybrid fluxes) for each node of the coarse mesh. In the 
case of M outputs and J inputs we will now require 2 (J -f MJ) hybrid fluxes; in all 2 (J + MJ) cases the 
particular matrices associated with each node are identical. 

4. We find the reconstructed errors (e u ,E u ) 3 e Yh,j = 1,..., J, and (e^,#*)™ G Yh,j = 1 = 
1, . . . , Af, such that 

(5.12) = V(u,V)eY fc , 

(5.13) 2£l j ((e'f y E*)”',(v,V)) = (K*)?((v,V)) - b((v,V),(p%,P%)™), V(t>, V) 6 Y h , 

where £y ■ corresponds to £ y of (4.7) or (4.9) with a' replaced by aj. We must solve J systems in (5.12) 
and MJ systems in (5.13). The former correspond to J different matrices; the latter to the same set of J 
different matrices, each with M different right-hand sides. 

5. Finally, we compute the lower {{s-)™) and upper ((s + )™) estimators as ( s± )™ = (' s )™ -j- (A)™, j = 
1 , . . . , J, m = 1 , . . . , A/, where 

(5.14) ( s i)J* = (s„)f - .((e“, E u )j, (e*, E *)"), 

(5.15) (A)f = 2 

We now require MJ inner products; in any event, this calculation does not contribute significantly to the 
computational cost. 

We close with a brief summary. In the case of direct solution methods, in particular LU (e.g, skyline or 
banded) solution procedures, the penalty for additional outputs and inputs is relatively small: on the coarse 
mesh, no additional LU decompositions are required, only additional (much less expensive) forward/back 
solves; on the broken fine mesh, the penalty is somewhat more significant, in particular as regards multiple 
inputs - we must now perform J LU decompositions. (The stronger dependence on number of inputs 
is perhaps to be expected given that our formulation is based on the sensitivity function.) In the case of 
iterative solution methods it is more difficult to amortize additional right hand sides, and thus the situation 
is less encouraging. Finally, we remark that even with the adjoint formulation the bounds require additional 
solves for multiple inputs; in essence, we obtain a bound on each individual sensitivity derivative, and thus 
the usual economies of scale do not apply. 

6. Numerical Results. We now present our numerical results. In all the examples we shall set a = 0, 
p = 10, and v = 1; for / = 0 the solution is a boundary layer of thickness 0{yj p = 0.1) near the right 
hand boundary, x = 1. In the first test case we look at variations in v\ in the second test case we look at 
variations in p\ and in the third test case we look at variations in /. (In the interest of brevity we do not 
present our results for variations in a; no surprises are encountered.) We shall consider only Alternative 
A for the £y — £ 3 M splitting, as rather extensive tests indicate that the bounds generated by Alternative 
A and Alternative B are effectively indistinguishable. In all cases we take h = .001 for the fine mesh, and 
h < H < i7 max for the coarse mesh, where i7 ma x = .025. We shall present our results in the form of s l H /s f h , 


u 



s'/s' h , s f _/s f h , and s'+/s r h as a function of H. These ratios are represented on the plots as follows. 


Legend: s% B /s h = O, s% B /s h = □, s H /s h = x, s^/s h = A. 


f 


H /o h _ 


j c h 


In each of the three test cases we shall consider three different output functionals £(v): the mean over 
the domain, 


( 6 . 1 ) 


£ x (u ) = f vdx , 

Jo 


which we shall denote the “mean” output; the value at a particular point in the domain (within the boundary 
layer), x — 0.95, 


( 6 . 2 ) 


£ 2 {v) = u(.95) , 


which we shall call the “point evaluation” output; and the flux at x = 1, 
(6.3) £ 3 (u) = a(v, x) - (/, x ) , 


which we shall denote the “flux” output. As regards the flux output, it is readily shown by integration by 
parts that, for u sufficiently smooth, P{u) = ^u x (l); the advantage of £ 3 of (6.3) over the more obvious 
representation £{v) = vv x {\) is that the former is a bounded functional while the latter is not. Boundedness 
(or equivalently, continuity) of the output functional (i) ensures that the adjoint problem is well-posed, and 
(ii) greatly improves the convergence rate of the bounds. Note that the other two outputs are also bounded: 
the mean output is bounded in all space dimensions; the point evaluation output is bounded only in one 
space dimension. 

Proceeding with the numerical tests, our first test case is a = 0, p = 10, i/ = /3, / = 0, and (3 — 1; we 
take <t= 1, which ensures coercivity. We present in Figures 6.1, 6.2, and 6.3 on page 15 the bound results 
for the mean, point evaluation, and flux outputs, respectively. As expected, we obtain bounds; the bounds 
converge as 0(H 2 )\ and the bounds are reasonably accurate even on the coarsest meshes. For the mean and 
point evaluation outputs s' H and s' are considerably more accurate than the bounds s'_ , ; however for the 

point evaluation output, the effectiveness of the estimator is, in fact, quite good. In any event, we emphasize 
that the bounds provide certainty that can not be extracted from cither s' H or s'. 

In our second test case we choose a = 0, p = /?, v = 1, / = 0, and /3 = 10; we again take <7=1. We 
present in Figures 6.4, 6.5, and 6.6 on page 16 the bound results for the mean, point evaluation, and flux 
outputs, respectively. The results are very similar to those for our first test case — perhaps not surprising 
since this second test case is, in effect, a rescaling of the first test case. Note, however, that the numerical 
treatment of the two test cases is quite different: in the first case bounds are guaranteed for all H ; in the 
second case bounds are assured only for H sufficiently small. The numerical results of Figures 6.4, 6.5, 
and 6.6 demonstrate that, in fact, bounds are obtained even on the coarsest mesh — that is, the U L 2 — H 1 
assumption” is robust. 

In our third test case we choose a = 0, p — 10, v = 1, / = sin(27r/?x) , and ft = 5. We present in Figures 
6.7, 6.8, and 6.9 on page 17 the bounds results for the mean, point evaluation, and flux outputs, respectively, 
for a = 1. We present in Figures 6.10, 6.11, and 6.12 on page 18 the bound results for the mean, point 
evaluation, and flux outputs, respectively, but now for a very large (a = 100) effectively infinite (note 
there is no coercivity limit on a since v' = 0 for this test case). Comparison of Figures 6.7, 6.8, and 6.9 with 
Figures 6.10, 6.11, and 6.12 clearly demonstrates the important role that scaling parameter optimization 



can play in achieving sharp bounds. In this particular example a — ► oo is easily identified as optimal: in 
the limit a — ► oc our general formulation reduces to the “direct” procedure to which we alluded in Section 
2. The development of effective scaling parameter optimization procedures for more general situations is a 
topic for future work. 

In closing, we simply remark that sensitivity derivatives arise with increasing frequency in many different 
design and optimization contexts. Both certainty and efficiency are currently compromised by the lack of 
quantitative, relevant error estimation procedures; the techniques presented and illustrated in this paper 
represent an important first step in addressing this problem. 
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Fig. 6.6. Results for the flux functional and p = & at (3 = 1. 
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